Constructing an object in a safe way is not easy:
    it is well known that dynamic dispatch
    or leaking \this during object construction
    is error-prone~\cite{Dean:1996,Seo:2007:SBD:1522565.1522587,Gil:2009:WRS:1615184.1615216},
    and various type systems and verifiers have been proposed to
    handle safe object initialization~\cite{Hubert:2010:ESO:1888881.1888890,Zibin:2010:OIG:1869459.1869509,Fahndrich:2007:EOI:1297027.1297052,XinQi:2009}.
As languages become more and more complex,
    new pitfalls are created due to the interactions among
    language features.

X10 is an object oriented programming language with a sophisticated
    type system (constraints, class invariants, non-erased generics, closures)
    and concurrency constructs (asynchronous activities, multiple places\removeGlobalRef{, global references}).
This paper shows that object initialization is a cross-cutting concern
    that interacts with other features in the language.
We discuss several language designs that restrict these interactions,
    and explain why we chose the \emph{hardhat} design for X10.

{Hardhat}~\cite{Gil:2009:WRS:1615184.1615216}
    is a design that prohibits dynamic dispatch
    or leaking \this (e.g., storing \this in the heap) during construction.
Such a design limits the user
    but also protects her from future bugs
    (see \Ref{Figure}{TwoPitfalls} below for two such bugs).
X10's hardhat design is more complex due to additional language features
    such as concurrency, places, and closures.

On the other end of the spectrum,
    Java and C\# allow
    dynamic dispatch and leaking \this.
However, they still maintain type and runtime safety
    by relying on the fact that every type has a default value
    (also called zero value, which is either 0, \code{false}, or \code{null}),
    and all fields are zero-initialized before the constructor begins.
As a consequence,
    a half-baked object can leak before all its fields are set. %\cite{Seo:2007:SBD:1522565.1522587} - reading uninitialized field references
Phrased differently,
    when reading a final field, one can read the default value initially and later read a different value.
Another source of subtle bugs is related to the synchronization barrier
    at the end of a constructor~\cite{JSR133}
    after which all assignments to final fields are guaranteed to be written.
The programmer is warned (in the documentation only!)
    that immutable objects (using final fields) are thread-safe only if
    \this does not escape its constructor.
%Before JSR 133~\cite{JSR133},
%    immutable objects could have different values in different threads
%    if synchronization was not used properly.
Finally, if the type-system is augmented, for example, with non-null types, then
    a default value no longer exists,
    which leads to complicated type-systems for initialization~\cite{Fahndrich:2007:EOI:1297027.1297052,XinQi:2009}.

\mbox{C++} sacrifices type-safety on the altar of performance: %as usual, gives you enough rope to hang yourself, but in an amazing speed.
    fields are not zero-initialized.
    (X10 has both type-safety and the performance for not zero-initializing fields.)
Therefore if \this leaks in \mbox{C++},
    one can read an uninitialized field resulting in an arbitrary value.
Moreover, method calls are statically bound during construction,
    which may result in an exception at runtime
    if one tries to invoke a virtual method of an abstract class (see \Ref{Figure}{Escaping-this}b below).
(Determining whether this happens is intractable~\cite{Gil:1998:CTA:646155.679689}.)
% todo: to reduce space
We believe a design for object initialization should have these
    desirable properties:
\begin{description}
%  \item[Reduce runtime errors]
%    Common errors should be caught at compile time.
%    \mbox{C++} pure virtual method error
%  \item[No security errors]
%    Java exception hole.

  \item[Cannot read uninitialized fields]
    One should not be able to read uninitialized fields.
    In \mbox{C++} it is possible to read uninitialized fields,
        returning an unspecified value which can lead to unpredictable behavior.
    In Java, fields are zero initialized before the constructor begins to execute,
        so it is possible to read the default or zero value,
        but never an unspecified value.

  \item[Single value for final fields]
    Final fields can be assigned exactly once, and
        should be read only after assigned.
    In Java it is possible to read a final field before it was assigned,
        therefore returning its default value.

  \item[Immutable objects are thread-safe]
    Immutable classes are a common pattern where fields are final/const
        and instances have no mutable state, e.g., \code{String} in Java.
    Immutable objects are often shared among threads without any explicit synchronization,
        because programmers assume that if another thread gets a handle to an object, then
        that thread should see all assignments done during initialization.
    However, weak memory models today do not necessarily have this guarantee
        and immutable objects could be thread-\emph{unsafe}!
    \Ref{Section}{FinalFields} below will show that this can happen in Java
        if \this escapes from the constructor~\cite{JSR133}.

%  \item[Parallel]
%    Concurrent and distributed code is error-prone to begin with.
%    The design should prevent initialization-related errors.

  \item[Simple]
    The order of initialization should be clear from the syntax, % intuitive,
        and should not surprise the user.
    Dynamic dispatch during construction disrupts the order
        of initialization by executing a subclass's method before the superclass finished its initialization.
    This kind of initialization order is error-prone and often surprises the user.

  \item[Flexible]
    The user should be able to express the common idioms
        found in other languages with minor variations.

  \item[Type-safe]
    The language should continue to be statically type-safe even
        if it has rich types that do not have a default or zero value,
        such as non-null types (\code{T\lb{}self!=null\rb} in X10's syntax).
    Type-safety implies that reading from a non-null type should never return \code{null}.
    Adding non-null types to Java~\cite{Fahndrich:2003:DCN:949305.949332,Fahndrich:2007:EOI:1297027.1297052,XinQi:2009}
        has been a challenge precisely due to
        Java's relaxed initialization rules.
\end{description}

We took the ideas of prohibiting dynamic dispatch or leaking \this during construction from~\cite{Gil:2009:WRS:1615184.1615216},
    and materialized them into a set of rules that cover all aspects of X10
    (type-system, closures, generics, properties, and concurrent and distributed constructs).
This hardhat design in X10 (version 2.2) has the above desirable properties,
    however they come at a cost of limiting flexibility:
    it is not possible to express cyclic immutable structures in X10.
We chose simplicity over flexibility in our design choices, e.g.,
    X10 prohibits creating an alias of \this during object construction
    (whereas a more flexible design could track aliases via alias-analysis, at the cost of sacrificing simplicity).
%Alternative designs for initialization are described in \Ref{Section}{designs},
%    such as the \code{proto} design (which was part of X10 version 2.0) that allows cyclic immutable structures
%    at the cost of a more complicated design.
To our knowledge, X10 is the first object-oriented (OO) language to adopt the strict hardhat initialization design.

Because one cannot read uninitialized fields in X10, there is no need zero-initialize the object's fields (as done in Java before the constructor executes).
A recent study~\cite{Yang:2011:WNM:2048066.2048092} measured the direct cost of \emph{zero initialization}, 
	which ``is surprisingly high: up to 12.7\%, with average costs ranging from 2.7 to 4.5\% on a high performance virtual machine on IA32 architectures."
(Note that the indirect costs due to caching might even be higher.)

X10 version 2.0 till 2.2 had an alternative initialization design called \emph{proto} that allowed cyclic immutable structures
    at the cost of a more complicated design.
In OOPSLA'11, Summers and M{\"u}ller~\cite{Summers:Mulller:2011}
	presented an initialization type-system that is almost identical to our proto proposal,
	but with different terminology:
	fully initialized objects are termed \emph{committed} (non-proto),
	and objects under initialization are termed \emph{free} (proto).
%Summers' type-system focused on non-null types in a Java-like language where all types may contain null.
Whereas in our proto proposal one cannot read uninitialized fields,
	in Summers' type-system reading uninitialized fields is allowed and it returns an \emph{unclassified} type
	(and reading non-null fields from an unclassified type is allowed but it may return null).
Phrased differently, reading a field is always allowed, but it may return null even for non-null fields.
In contrast, X10 and \mbox{C++} have types that cannot contain null 
	(in \mbox{C++} only pointers may be null, and X10 has \emph{structs}
		which are inlinable objects that may not contain null),
	thus Summers' type-system is not applicable in such languages.
Moreover, proto was used in X10 in the past 3 years prior to X10 2.2, 
	and it was made obselete in favor of the hardhat design presented in this paper
	because proto did not work well in practice.
For example, consider an implementation of \code{LinkedList} that has a non-null field 
	\code{header}:
\vspace*{-1.1ex}\begin{lstlisting}
class LinkedList extends SuperClass {
  final Entry header = new Entry();
  LinkedList(Collection c) { addAll(c); }
  public boolean addAll(Collection c) { 
    // this.header is null if SuperClass calls addAll in its constructor.
  }
}
\end{lstlisting}\vspace*{-1.1ex}
During construction we must read from \code{this.header}, but \code{this} is still proto
	and it is illegal to read from a proto object 
	(in Summer's type-system, reading is allowed but it may return null).
It is tempting to think that a dataflow algorithm can prove that \code{header} was
	already assigned, however the dataflow must be inter-procedural, and it is further complicated by overriding and \this escaping.
For example, if \code{SuperClass} calls \code{addAll} in its constructor,
	then \code{header} will still have the default value of \code{null}.
The newly proposed hardhat design can modularly type-check this code assuming that \code{addAll} 
	is annotated as non-escaping (see \Ref{Section}{rules}).

The \emph{contributions of this paper} are:
    (i)~a complete and strict hardhat design in a
        full-blown advanced OO language with many cross-cutting concerns
        (default values, final fields, dataflow analysis, overriding, and
			especially the concurrent and distributed aspects),
	(ii)~an inter-procedural fixed-point algorithm for definite-async assignment,
%    (iii)~discussing alternative designs, such as the proto design,
    (iii)~implementation inside the X10 open-source compiler and
        converting the entire X10 code-base (+200K lines of code) %according to Igor
        to conform to the hardhat principles,
    (iv)~FX10 formalism which is the first to present a flow-sensitive effect system with concurrency constructs
    and a soundness theorem stating that one can never read an
    uninitialized field in a statically correct  program.

For object initialization rules, the details matter. 
Instead of basing our work on abstract theoretical discussions, 
we have chosen to work with a concrete language (X10, see \Ref{Section}{implementation}) in which all these rules have been worked out 
to illustrate the subtleties involved. 
Our analysis and design will be applicable to any OO language with fine-grained concurrency. 
Object initialization rules must be dealt with in order to support determinate computation. 
For example, \emph{Deterministic Parallel Java} (DPJ)~\cite{Bocchino:2009:TES:1640089.1640097} also have similar rules for object initialization
	to prevent \this from leaking: ``the DPJ type and effect system ensures that no other task can access \this until after the constructor returns."	

The remainder of this introduction 
    presents common initialization pitfalls (in sequential, concurrent, and distributed code in both Java and X10)
    and how the hardhat design prevents them.
Specifically, 
    it presents
    initialization pitfalls in sequential code (\Ref{Section}{Initialization-pitfalls}),
    concurrent code (\Ref{Section}{FinalFields}),
    and
    distributed X10 code (\Ref{Section}{Parallelism}),
    and
    the crux of the hardhat design that prevents these sequential pitfalls (\Ref{Section}{hardhat-crux}).

\subsection{Initialization pitfalls in sequential code}
\label{Section:Initialization-pitfalls}

\Ref{Figure}{TwoPitfalls}a demonstrates the two most common initialization pitfalls in Java:
    leaking \this and dynamic dispatch.
We will first explain the surprising output due to dynamic dispatch,
    and then the less known possible bug due to leaking \this.

\begin{figure}
\vspace*{-1.1ex}
\subfloat[Initialization pitfalls]{\lstinputlisting[boxpos=t]{seq-pitfall.txt}}
\subfloat[Fixed to conform to the hardhat design]{\lstinputlisting[boxpos=t]{seq-fix.txt}}
\vspace*{-1.1ex}
\caption{Two initialization pitfalls in Java:
    leaking \this and dynamic dispatch.}
\vspace*{-1.1ex}
\label{Figure:TwoPitfalls}
\end{figure}


Executing \code{new B()} prints \code{a=42,b=0}, which is
    surprising to most Java users.
One would expect \code{b} to be 2, and \code{a} to be either 1 or 44.
However, due to initialization order and dynamic dispatch,
    the user sees the default value for \code{b} which is 0,
    and therefore the value of \code{a} is 42.
We will trace the initialization order for \code{new B()}:
    we first allocate a new object with zero-initialized fields,
    and then invoke the constructor of \code{B}.
The constructor of \code{B} first calls \code{super()},
    and only afterward it will run the field initializer which sets \code{b} to 2.
This is the cause of surprise, because \emph{syntactically} the field initializer comes before
    \code{super()},
    however it is executed after.
(And writing \code{b=2;super();} is illegal in Java because
    calling super must be the first statement).
During the \code{super()} call we perform two dynamic dispatches:
    the two calls (\code{initA()} and \code{toString()})
    execute the implementation in \code{B} (and recall that \code{b} is still 0).
Therefore, \code{initA()} returns 42, and \code{toString()} returns \code{a=42,b=0}.
This bug might seem pretty harmless,
    however if we change the type of \code{b} from \code{int} to \code{Integer},
    then this code will throw a \code{NullPointerException},
    which is more severe.

The second pitfall is leaking \this before the object is fully-initialized,
    for example, \code{S.add(this)}.
Note that we leak a partially-initialized object, i.e.,
    the fields of \code{B} have not yet been assigned and they contain their default values.
Suppose that some other thread iterates over \code{S} and prints them.
Then that thread might read \code{b=0}.
In fact, it might even read \code{a=0}, even though we just assigned 42 to \code{a} two statements ago!
The reason is that this write is guaranteed to be seen by other threads only
    after an implicit synchronization barrier that is executed after the constructor ends.
\Ref{Section}{FinalFields} further explains final fields in Java and the implicit synchronization barrier.


\subsection{The crux of the hardhat design}
\label{Section:hardhat-crux}
The hardhat design in X10 (described in \Ref{Section}{rules})
    prevents both pitfalls,
    because its rules allow dynamic dispatching only when \this cannot be accessed (first pitfall)
    and prohibit leaking \this (second pitfall).
We use two method annotations to mark that a method is non-escaping: \code{@NonEscaping} and \code{@NoThisAccess};
    the first prohibits leaking \this,
    and the second is even more strict and prohibits any access of \this.
The essence of the hardhat design are these two rules:
    (i)~Constructors and non-escaping methods may leak/alias \this \emph{only} to other non-escaping methods
        (i.e., \this can only be used as the receiver of a non-escaping method call),
    (ii)~Non-escaping methods are either private or final (thus they cannot be overridden),
        except \code{@NoThisAccess} methods that may be overridden but they cannot access \this.
These two rules prevent the two pitfalls of leaking \this and dynamic dispatching.

Initialization in X10 has the following main attributes: % (\Ref{Section}{Virtues} revisits these attributes after the rules are given):
(i)~\this is the only accessible raw/uninitialized object in scope,
(ii)~only \code{@NoThisAccess} methods can be dynamically dispatched during construction,
(iii)~one can read a field \emph{only} after it was assigned, and all fields are assigned by the time the constructor finishes,
(iv)~reading a final field always results in the same value.
(In contrast to Java and~\cite{Summers:Mulller:2011} where reading a final field might return different values at different times.)
Furthermore, with the hardhat rules there is even no need to \emph{zero-initialize all fields} before executing the constructor (as done~in~Java),
    thus reducing the program runtime.
(We are now in the process of measuring this reduction in runtime; Using a simple bytecode verifier it is possible to ensure that this optimization is safe.)


\Ref{Figure}{TwoPitfalls}b shows how to convert the code of \Ref{Figure}{TwoPitfalls}a to the hardhat design
    and thus avoid these two pitfalls (but the original program behavior is changed).
We made the following changes:
    (i)~\code{toString} now delegates to a final non-escaping method \code{toStringOfA},
    and the constructor of \code{A} can call \code{toStringOfA};
    \code{B} cannot override this method because it is final,
    (ii)~\code{initA} is \code{@NoThisAccess} and therefore \code{B.initA} cannot read the field \code{b} (which has not been assigned yet),
    (iii)~instead of leaking \this into \code{S} in the constructor of \code{A},
        we refactored the code into two factory methods that create instances of \code{A} and \code{B},
        and only then add the fully-initialized instance to \code{S}.



\subsection{Initialization pitfalls in concurrent code}
\label{Section:FinalFields}
We will start with an anecdote:
    suppose you have a friend that
    playfully removed all the occurrences of the \code{final} keyword
    from your legal Java program.
Would your program still \emph{run} the same?
On the face of it, \code{final} is used only to make the \emph{compiler} more {strict},
    i.e., to catch more errors at compile time
    (to make sure a method is not overridden, a class is not extended, and a field or local
        is assigned exactly once).
After \emph{compilation} is done, \code{final} should not change the \emph{runtime} behavior of the program.
However, this is not the case due to interaction between initialization and concurrency:
    a synchronization barrier is implicitly added
    at the end of a constructor~\cite{JSR133}
    ensuring that assignments to \emph{final} fields are visible to all other threads.
(Assignments to non-final fields might not be visible to other threads!)

The synchronization barrier was added to the memory model of Java 5
    to ensure that the common pattern of immutable objects is thread-safe.
The memory model does not guarantee \emph{sequential consistency}, but only \emph{weak consistency}.
(The barrier would not be needed with sequential consistency.)
Without this barrier another thread might see the default value of a field
    instead of its final value.
For example, it is well-known that \code{String} is immutable in Java,
    and its implementation uses three {final} fields:
    \code{char[] value}, and two \code{int} fields named \code{offset} and \code{count}.
The following code
    ~~\code{"AB".substring(1)}\\
will return a new string \code{"B"}
    that shares
    the same \code{value} array as \code{"AB"}, %\lb'A','B'\rb
    but with \code{offset} and \code{count} equal to 1.
Without the barrier, %or without the use of final fields in \code{String},
    another thread might see the default values for these three fields,
    i.e., \code{null} for \code{value} and 0 for \code{offset} and \code{count}.
For instance,
    if one removes the \code{final} keyword from all three fields in \code{String},
    then
    the following code might print \code{B} (the expected answer),
    or it might print
    \code{A} or an empty string,
    or might even throw a \code{NullPointerException}:
\begin{lstlisting}
final String name = "AB".substring(1);
new Thread() { public void run() {
    System.out.println(name); } }.start();
\end{lstlisting}

A similar bug might happen in \Ref{Figure}{TwoPitfalls}a
    because \this was leaked into \code{S} before the barrier was executed.
Consider another thread that iterates over \code{S} and reads field \code{a}.
It might read 0, because the assignment of 42 to \code{a} is guaranteed to be visible to other threads
    only after the barrier was reached.

Java's documentation recommends using final fields
    when creating an immutable class,    
    and avoid leaking \this in the constructor.
However, \code{javac} does not even give a warning if that recommendation is violated.
To summarize, final fields in Java
    enable thread-safe immutable objects,
    but the user must be careful to avoid the pitfall of leaking \this.
The hardhat design in X10 prevents any leakage of \this,
    thus making it safe and easy to create immutable classes.


\subsection{Initialization pitfalls in distributed X10 code}
\label{Section:Parallelism}
X10 supports parallelism in the form of both concurrent and distributed code.
Next we describe parallelism in X10 and its interaction with object initialization.

\emph{Concurrent code} uses asynchronous un-named activities that are created with the \code{async} construct,
    and it is possible to wait for activities to complete with the \code{finish} construct.
Informally, statement \code{async S} executes statement \code{S} asynchronously;
    we say that the newly created activity \emph{locally terminated} when it finished executing \code{S},
        and that it \emph{globally terminated} when it locally terminated and any activity spawned by \code{S}
            also globally terminated.
Statement \code{finish S} blocks until all the activities created by \code{S} globally terminated.

\emph{Distributed code} is run over multiple \emph{places} that do \emph{not} share memory,
    therefore objects are (deeply) copied from one place to another.
The expression \code{at(p) E} evaluates \code{p} to a place, then copies all captured references in \code{E} to place \code{p},
    then evaluates \code{E} in place \code{p}, and finally copies the result back to the original place.
Note that \code{at} is a synchronous construct, meaning that the current activity is blocked until the \code{at} finishes.
This construct can also be used as a statement, in which case there is no copy back
    (but there is still a notification that is sent back when the \code{at} finishes, in order to release the blocked activity in the original place).
\removeGlobalRef{
\Ref{Section}{Global-references} describes how \emph{global references} are used in distributed code
    to allow objects in one place to point to objects in another place.
}



\begin{figure}
\vspace*{-1.1ex}
\subfloat[Initialization pitfalls in X10]{\lstinputlisting[boxpos=t]{fib_wrong.txt}}
\quad
\subfloat[Fixed to conform to the hardhat design]{\lstinputlisting[boxpos=t]{fib_fixed.txt}}
\vspace*{-1.1ex}
\caption{Concurrent and distributed Fibonacci example in X10.
    Concurrent code is expressed using \code{async} and \code{finish}:
        \code{async} starts an asynchronous activity,
        and \code{finish} waits for all spawned activities to finish.
    Distributed code uses \code{at} to shift among
        \emph{places};
        \code{here} denotes the current place.
    \code{at(p) E} evaluates expression \code{E}
        in place \code{p}, and finally copies the result back;
        any final variables captured in \code{E} from
        the outer environment (e.g., \code{n})
        are first copied to place \code{p}.
    The two initialization pitfalls:
        (1) %Distributed bug:
            write to field \code{\this.fib2} in another place, 
            which causes (an uninitialized) \this to be copied to \code{p},
            so one writes to a \emph{copy} of \this 
            (and the original object is never fully initialized!),
        (2) %Concurrency bug:
            read from \code{fib2} before its write definitely \emph{finished}.
    }
\label{Figure:DistributedFib}
\end{figure}

\Ref{Figure}{DistributedFib}a (and \Ref{Figure}{DistributedFib}b) shows how to (correctly) calculate the Fibonacci number \code{fib(n)} in X10
    using concurrent and distributed code.
The keywords \code{val} and \code{var} are modifiers that correspond to final and non-final variables, respectively.
Note how \code{fib(n-2)} is calculated asynchronously at the next place (\code{next()} returns the next place in a cyclic ordering of all places),
    while simultaneously recursively calculating \code{fib(n-1)} in the current place (that will recursively spawn a new activity, and so on).
Therefore, the computation will recursively continue to spawn activities at the next place until \code{n} is 1.
When both calculations globally terminate, the \code{finish} unblocks,
    and we sum their result into the \emph{final} field \code{fib}.

We note that using \emph{final local variables} for \code{fib2} and \code{fib1} instead of fields
    would have made this example
    more elegant, however we chose the latter because this paper focuses on \emph{object} initialization.
X10 has similar initialization rules for final locals and final fields,
    but it is outside the scope of this paper to present all forms of initialization in X10
    (including local variables and static fields).
Details of those can be found in X10's language specification at \code{x10-lang.org}.

There are two possible pitfalls in this example.
The first is a distributed pitfall, where
    one assigns to a field of a copy of \this in another place (instead of assigning in the original place).
Leaking \this to another place before it is fully initialized
    might also cause bugs in custom serialization code (see \Ref{Section}{Multiple-Places}).
The second is a concurrency pitfall, where we forget to use \code{finish},
    and therefore we might read from a field before its assignment was definitely executed.
Java has definite-assignment rules (using an intra-procedural data-flow analysis)
    to ensure that a read can only happen after a write;
    The hardhat design in X10 adopted those rules
        and extended them in the face of concurrency to support the pattern of
            \emph{asynchronous initialization}
            where an \code{async} must have an enclosing \code{finish}
    (using an intra-class inter-procedural analysis, see \Ref{Section}{Read-write-rules}).

The hardhat design in X10 prevents both pitfalls by ensuring that all fields of an object are 
	definitely-synchronously assigned when construction of that object ends,
	and that only fully initialized objects can cross places.



The rest of this paper is organized as follows.
\Ref{Section}{rules} presents the hardhat initialization rules of X10 version 2.2
    using examples,
    by slowly adding language features and describing their interaction with
    object initialization.
%\Ref{Section}{designs} describes alternative designs for object initialization
%    (one was implemented in X10 version 2.0 and another was under consideration for 2.2),
%    weighing the pros and cons of each.
\Ref{Section}{implementation} outlines our implementation within the X10 compiler using the polyglot framework,
    the compilation time overhead of checking these initialization rules,
    and the annotation overhead in our X10 code base.
%\Ref{Section}{case-study} investigates the consequences of having a hardhat design
%    in a case study that converted a large collection of Java classes to X10.
\Ref{Section}{formalism} presents Featherweight X10 (FX10), which is a formalization of core X10
    that includes \hfinish, \hasync, and flow-sensitive type-checking rules.
\Ref{Section}{related-work} summarizes previous work in the field of object initialization.
Finally, \Ref{Section}{conclusion} concludes.
